Try Textual Inversion
https://gyazo.com/461f9aa8415fdc0e45e20a4eb9d42dec
Training data
https://gyazo.com/760f4eb8f518290288ed0a87b9b26e1a
AI-generated photo, AI-generated Monet-style painting
https://gyazo.com/ffa21fa2d8ab09a2a76d41459efcbef9 https://gyazo.com/29787da6b19012fc721907eeb6f3a285
By the way, the prompts are along the lines of "a photo of our cat" and "a painting of our cat by Claude Monet". If you change the "our cat" part to plain "cat", you get the following. The results are more convincing as cats, but they lose the characteristics of our cat.
https://gyazo.com/a280b7afa562217920b6b3e6427b1a0d https://gyazo.com/960a4befc7d6f2f8c0d4ca6f3fbebc76
My cat is probably a variant of the three-color coat pattern (black, orange, and white) in which the black pigment is missing and the orange is much lighter.
https://gyazo.com/73cf1b06b231de0dcb32f31cba3ef74e https://gyazo.com/29bbe5396e9c4a16f1ac0a765b4af8e4
The embedding file generated by Textual Inversion is about 5 KB. Its main content is a single 768-dimensional float vector, plus some metadata about the token.
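As a rough sanity check on that file size: assuming the vector is stored as 32-bit floats (the precision is my assumption, I have not inspected the file format), the vector alone accounts for about 3 KB, which would leave the rest of the roughly 5 KB for the token metadata and serialization overhead.

```python
import struct

EMBEDDING_DIM = 768  # dimensionality stated in the text

# Assumption: each component is a 32-bit float.
# The real file layout is not described here; this only measures the raw vector.
vector = [0.0] * EMBEDDING_DIM
raw = struct.pack(f"<{EMBEDDING_DIM}f", *vector)

vector_bytes = len(raw)
print(vector_bytes)         # 3072
print(vector_bytes / 1024)  # 3.0
```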
---Impressions
@nishio: At the moment my impression is "not much of a resemblance", but compared with random cat photos it has clearly picked up features, so I have a feeling that within a few years there will be a lot of people who keep tinkering in search of a face. For example, someone could train on photos of a daughter who died young, generate hundreds of images every day, pick the ones they like, and so create new photos of the daughter who "lives on in my heart": commemorative photos at sightseeing spots she never visited, field day photos, wedding photos...
This is a virtual souvenir photo from when I took my cat, a strictly indoor cat, to the virtual ocean!
https://gyazo.com/f71c768dc578e7a8e678af69f9b76bc6
"Wedding Photography."
Ah, so you could generate "your idea of an ideal son-in-law", match them up, marry them, and then start generating pictures of "grandchildren" who never existed...
This kind of "virtual reality" sounds like it could go badly. If there is demand, there will be providers, and then comes the tragedy of losing your virtual grandchildren when the provider goes out of business... A daughter who died young and came of age in the metaverse, locked in by Meta (hell).
My virtual daughter and son-in-law are raising their non-existent grandchildren in a beautiful non-existent house by a non-existent lake while living off a non-existent farm, all locked in by Meta, and the maintenance fees are deducted from my account on a subscription model. Just when it seems the user hasn't logged in lately, it turns out they have died, but the account was never cancelled, so the charges keep coming (hell).
I saw a response about "training on photos of your favorite idol" and realized there could be hell even when the subject is still alive. There is plenty of training data, so improving the quality of the face would probably be easy. The person keeps growing older, but a fan says "No, I like how they looked at 20", freezes them at that age, and keeps them forever in the metaverse. Idol breeding.
There are going to be hundreds of people who will remove the porn filter.
---
Bowman
https://scrapbox.io/files/6323fdeeff937700225f1963.png
https://scrapbox.io/files/6323fdf1ff25a80021c2e9e2.png
I got a very good one! I was so excited, but this turned out to be the best case: even after generating more than 100 images afterwards, I could not produce anything better.
https://gyazo.com/d3467678eae4ca379c5af5118ec9128a
It seems to have interpreted this as "Bowman usually comes with a local dish" (lol).
Too many outputs have the food as the main subject. It was told "it's a CHARACTER" during training.
In fact, this was the first experiment. After the initial excitement I left it for a while, then decided "let's try it with real photos", which led to the cat experiment above.
Live-action Bowman
https://gyazo.com/33ea488b2647313b64c54627f439a519
https://gyazo.com/bd735fcae08827967d2e32798ea3075a
It's technically interesting that it has picked up various things such as "texture", "colors that tend to be used", "CO-like logo", and "size relative to people" from the images I gave it with no prior information... but I guess consumers won't be satisfied with this quality, right?
Results can be seed sensitive. If you're unsatisfied with the model, try re-inverting with a new seed (by adding --seed <#> to the prompt).
If you pull the gacha 100 times at an hour per pull, you may or may not get a good one.
Since what we get as a result of training is a single 768-dimensional vector, we may be able to search efficiently by keeping only the good ones among multiple vectors and then averaging them or applying a genetic algorithm (GA).
In the end it is an optimization problem in 768-dimensional space where the evaluation function is a human.
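The idea above could be sketched roughly as follows. This is only an illustration, not something I actually ran: the function names `average_vectors`, `crossover`, and `mutate` are hypothetical, the candidates are random stand-ins for embeddings obtained from separate training runs, and the "human picks" are hard-coded indices standing in for a person choosing the outputs they like.

```python
import random

DIM = 768  # size of one learned embedding vector, as stated in the text


def average_vectors(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each component is copied from one parent at random."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(vector, rng, sigma=0.01):
    """Add small Gaussian noise to every component."""
    return [x + rng.gauss(0.0, sigma) for x in vector]


rng = random.Random(0)

# Stand-ins for embeddings obtained from 8 training runs with different seeds.
candidates = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(8)]

# Stand-in for the human evaluation step: indices of the runs whose images looked good.
human_picks = [1, 4, 6]
selected = [candidates[i] for i in human_picks]

# Option 1: average the selected vectors into a single new candidate.
averaged = average_vectors(selected)

# Option 2: one GA generation, breeding children from random pairs of selected vectors.
next_generation = []
for _ in range(len(candidates)):
    parent_a, parent_b = rng.sample(selected, 2)
    next_generation.append(mutate(crossover(parent_a, parent_b, rng), rng))

print(len(averaged), len(next_generation))
```

Each vector in `next_generation` would then have to be turned into images and judged by a person again, so the human evaluation step remains the bottleneck.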
I have a feeling that with enough effort the cat training will reach a satisfactory level, but Bowman will not.
Bowman probably can't be represented by a single token; it would take about three, for example one each for the face, the logo, and the outfit.
Training on the Unexplored (Mitou) logo
https://scrapbox.io/files/632401548ada340022341544.png
https://scrapbox.io/files/63240156cb72b60022b12ec9.jpg
I tried varying the background to make it clear which part is the logo, but it still doesn't seem to work.
I guess you'd have to take pictures of a logo-shaped 3D object placed in various locations.
It seems to have understood it as "an abstract image with greenish, diagonal and horizontal lines" rather than "an unexplored logo."
---
This page is auto-translated from /nishio/Textual Inversionを試してみる using DeepL. If you find something interesting but the auto-translated English is not good enough to understand, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.